Search results for "Language model"
showing 10 items of 17 documents
Combining Machine Translated Sentence Chunks from Multiple MT Systems
2018
This paper presents a hybrid machine translation (HMT) system that uses syntactic analysis to acquire phrases of source sentences, translates the phrases using multiple online machine translation (MT) system application program interfaces (APIs) and generates output by combining translated chunks to obtain the best possible translation. The aim of this study is to improve translation quality of English – Latvian texts over each of the individual MT APIs. The selection of the best translation hypothesis is done by calculating the perplexity for each hypothesis using an n-gram language model. The result is a phrase-based multi-system machine translation system that makes it possible to improve MT out…
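The perplexity-based selection step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a bigram model with add-one smoothing trained on a toy corpus; the abstract does not state the n-gram order, smoothing method, or toolkit, so the function names `train_bigram`, `perplexity`, and `select_best` are hypothetical and not the authors' implementation.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count bigram contexts and bigrams over whitespace-tokenized sentences."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"<s>", "</s>"}
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])             # every token that precedes another
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent token pairs
    return contexts, bigrams, len(vocab)

def perplexity(sentence, contexts, bigrams, vocab_size):
    """Perplexity of one sentence under the add-one smoothed bigram model."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Add-one (Laplace) smoothing is an assumption of this sketch.
        p = (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(tokens) - 1))

def select_best(hypotheses, model):
    """Return the hypothesis with the lowest perplexity."""
    return min(hypotheses, key=lambda h: perplexity(h, *model))

model = train_bigram(["the cat sat on the mat", "the dog sat on the rug"])
candidates = ["rug the on sat cat the", "the cat sat on the rug"]
print(select_best(candidates, model))
```

Lower perplexity means the model finds the word sequence more predictable, so the fluent candidate is preferred over the scrambled one.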
Ordinal mind change complexity of language identification
1997
The approach of ordinal mind change complexity, introduced by Freivalds and Smith, uses constructive ordinals to bound the number of mind changes made by a learning machine. This approach provides a measure of the extent to which a learning machine has to keep revising its estimate of the number of mind changes it will make before converging to a correct hypothesis for languages in the class being learned. Recently, this measure, which also suggests the difficulty of learning a class of languages, has been used to analyze the learnability of rich classes of languages. Jain and Sharma have shown that the ordinal mind change complexity for identification from positive data of languages formed…
Resolving ambiguities in a grounded human-robot interaction
2009
In this paper we propose a trainable system that learns grounded language models from examples with a minimum of user intervention and without feedback. We have focused on the acquisition of grounded meanings of spatial and adjective/noun terms. The system has been used to understand and subsequently to generate appropriate natural language descriptions of real objects and to engage in verbal interactions with a human partner. We have also addressed the problem of resolving possible ambiguities arising during verbal interaction through an information-theoretic approach.
A Sub-Symbolic Approach to Word Modelling for Domain Specific Speech Recognition
2006
In this work, a sub-symbolic technique for the automatic, data-driven construction of language models is presented. Such a technique can be used to build a language-modelling module that can be easily integrated into existing speech recognition architectures, such as the well-found HTK architecture. The proposed technique takes advantage of both the traditional LSA approach and a novel application of a probability space metric known as "Hellinger's distance". Experimental trials are also presented in order to validate the proposed approach.
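For reference, the Hellinger distance between two discrete probability distributions can be computed as in the sketch below. This shows only the standard definition of the metric; how the paper combines it with LSA is not described in the abstract, and the function name `hellinger` is hypothetical.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length.

    Both inputs are assumed to be non-negative and to sum to 1.
    The result lies in [0, 1]: 0 for identical distributions,
    1 for distributions with disjoint support.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared) / math.sqrt(2)

# Two hypothetical word distributions over a three-word vocabulary.
print(hellinger([0.5, 0.3, 0.2], [0.4, 0.4, 0.2]))
```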
The Idea Machine: LLM-based Expansion, Rewriting, Combination, and Suggestion of Ideas
2022
We introduce the Idea Machine, a creativity support tool that leverages large language models (LLMs) to empower people engaged in idea generation tasks. The tool includes a number of affordances that can be used to enable various levels of automation and intelligent support. Each idea entered into the system can be expanded, rewritten, or combined with other ideas or concepts. An idea suggestion mode can also be enabled to make the system proactively suggest ideas.
Linguistic interpretation of speech errors
2016
The paper is an attempt to illustrate the linguistic interpretation of speech, a problem known to remain insufficiently resolved, especially for Romanian. The cause lies in the multitude of criteria that can or should be considered important in speech processing. The aim of this study is to develop a computational tool to identify possible errors related to the morphosyntactic structure of speech. Our goal is to assist users, who can automatically receive different suggestions that help them improve the quality of their text. Thus, we chose an interdisciplinary approach through speech analysis that brings together the key fields of linguistics, computer science and so on…
An LP-based hyperparameter optimization model for language modeling
2018
In order to find hyperparameters for a machine learning model, algorithms such as grid search or random search are used over the space of possible values of the model's hyperparameters. These search algorithms opt for the solution that minimizes a specific cost function. In language models, perplexity is one of the most popular cost functions. In this study, we propose a fractional nonlinear programming model that finds the optimal perplexity value. The special structure of the model allows us to approximate it by a linear programming model that can be solved using the well-known simplex algorithm. To the best of our knowledge, this is the first attempt to use optimization techniques to find per…
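The grid-search baseline that this abstract contrasts with can be sketched as follows. The example is an assumption-laden illustration, not the paper's LP model: it tunes a single hypothetical hyperparameter, the add-k smoothing constant `k` of a unigram model, by minimizing held-out perplexity. The names `heldout_perplexity` and `grid_search` are invented for this sketch.

```python
import math
from collections import Counter

def heldout_perplexity(train_tokens, heldout_tokens, k):
    """Perplexity of held-out tokens under an add-k smoothed unigram model (k > 0)."""
    counts = Counter(train_tokens)
    # Assumption: the vocabulary is the union of training and held-out words.
    vocab = set(train_tokens) | set(heldout_tokens)
    total = len(train_tokens) + k * len(vocab)
    log_prob = sum(math.log((counts[w] + k) / total) for w in heldout_tokens)
    return math.exp(-log_prob / len(heldout_tokens))

def grid_search(train_tokens, heldout_tokens, grid):
    """Evaluate every candidate k and return (best perplexity, best k)."""
    scored = [(heldout_perplexity(train_tokens, heldout_tokens, k), k) for k in grid]
    return min(scored)

train = "a a a b".split()
heldout = "a b c".split()
best_ppl, best_k = grid_search(train, heldout, [0.01, 0.1, 1.0])
print(best_k, best_ppl)
```

Grid search evaluates the cost function once per candidate value, so its cost grows with the size of the grid; the abstract's proposal is to replace this enumeration with an optimization model.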
Translingual text mining for identification of language pair phenomena
2016
Translingual Text Mining (TTM) is an innovative technology of natural language processing for building multilingual parallel corpora, processing machine translation, contextual knowledge acquisition, information extraction, query profiling, language modeling, contextual word sensing, creating feature test sets and for a variety of other purposes. The Keynote Lecture will discuss opportunities and challenges of this computational technology. In particular, the focus will be on the identification of language pair phenomena and their applications to building a holistic language model, which is a novel tool for processing machine translation, supporting professional translations, evaluation of tran…
Metadata-Oriented Language Model in Translingual Retrieval of Digital Data
2015
Translingual retrieval relies on processing a source language to retrieve digital document content in a target language. From the perspective of successfully browsing digital catalogues, the probability of retrieving a full-text document in a language other than the query language is close to zero, owing to the fact that it is a problem not only of the library collection but especially of matching the index terms with the query keywords, which are assumed to be their translation equivalents. In addition, hardly any digital library system incorporates a translation component. As a result, such matching is rather coincidental. Our approach to the translingual document retrieval problem is …
Teaching GP to program like a human software developer
2019
Program synthesis is one of the relevant applications of GP with a strong impact on new fields such as genetic improvement. In order for synthesized code to be used in real-world software, the structure of the programs created by GP must be maintainable. We can teach GP how real-world software is built by learning the relevant properties of mined human-coded software, which can be easily accessed through repository hosting services such as GitHub. Combining program synthesis and repository mining is therefore a logical step. In this paper, we analyze whether GP can write programs with properties similar to code produced by human software developers. First, we compare the structure of functions generat…